Acquiring Lexical Knowledge from Text: A Case Study

Authors

  • Paul S. Jacobs
  • Uri Zernik
Abstract

Language acquisition addresses two important text-processing issues. The immediate problem is understanding a text despite lexical gaps. The long-term issue is that the understander must incorporate new words into its lexicon for future use. This paper describes an approach that constructs new lexical entries gradually by analyzing a sequence of example texts. The approach tolerates new words gracefully while enabling automated extension of the lexicon. Each newly acquired lexeme starts as a set of assumptions derived from analyzing the word in a textual context. A variety of knowledge sources, including morphological, syntactic, semantic, and contextual knowledge, determine the assumptions. These assumptions, along with their justifications and dependencies, are interpreted and refined by a learning program that ultimately updates the system's lexicon. The approach uses existing linguistic knowledge, together with generalization over multiple occurrences, to create new operational lexical entries.

Producing an analysis of natural language text, even in a restricted domain, requires a rich and robust lexicon. Developing such a lexicon automatically [Byrd, 1988; Boguraev and Briscoe, 1988] has emerged as one of the immediate challenges for natural language processing because of the overwhelming difficulty of constructing a lexicon by hand. The automatic derivation of word meanings also improves both the relevance of the definitions to a particular domain and the internal consistency of the derived lexicon. In this paper we describe a method for acquiring new words from multiple examples in texts. We explain how the programs RINA [Zernik, 1987] and TRUMP [Jacobs, 1987] cooperate in implementing this method, and we give empirical data to motivate the entire approach.
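
The gradual process described in the abstract can be illustrated with a minimal sketch. This is not the RINA or TRUMP implementation; the class names, the `min_support` threshold, the feature labels, and the example word are all hypothetical, chosen only to show how assumptions from several knowledge sources might accumulate over multiple occurrences before an entry is committed to the lexicon.

```python
from dataclasses import dataclass, field


@dataclass
class Assumption:
    """One hypothesized property of an unknown word (illustrative only)."""
    feature: str      # e.g. "pos" or "sense"
    value: str        # e.g. "verb"
    source: str       # knowledge source that first proposed it
    support: int = 1  # number of occurrences consistent with it


@dataclass
class LexemeHypothesis:
    """Accumulates assumptions about one word across example texts."""
    word: str
    assumptions: dict = field(default_factory=dict)

    def observe(self, evidence):
        """Fold in one occurrence; evidence is a list of (feature, value, source)."""
        for feature, value, source in evidence:
            key = (feature, value)
            if key in self.assumptions:
                self.assumptions[key].support += 1
            else:
                self.assumptions[key] = Assumption(feature, value, source)

    def entry(self, min_support=2):
        """Per feature, keep the best-supported value seen at least min_support times."""
        best = {}
        for a in self.assumptions.values():
            if a.support < min_support:
                continue
            current = best.get(a.feature)
            if current is None or a.support > current.support:
                best[a.feature] = a
        return {feature: a.value for feature, a in best.items()}


# Hypothetical word and evidence, one observe() call per example text.
hyp = LexemeHypothesis("plummet")
hyp.observe([("pos", "verb", "morphological"), ("sense", "decrease", "contextual")])
hyp.observe([("pos", "verb", "syntactic"), ("sense", "fall", "contextual")])
hyp.observe([("pos", "verb", "syntactic"), ("sense", "decrease", "semantic")])
lexical_entry = hyp.entry()
```

In this sketch an assumption seen only once (here, the sense "fall") stays tentative and is left out of the entry, which mirrors the idea of refining assumptions over a sequence of texts rather than committing after a single example.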

Publication date: 1988